Field-aware Calibration: A Simple and Empirically Strong Method for Reliable Probabilistic Predictions
It is often observed that the probabilistic predictions given by a machine
learning model can disagree with averaged actual outcomes on specific subsets
of data, which is also known as the issue of miscalibration. It is responsible
for the unreliability of practical machine learning systems. For example, in
online advertising, an ad can receive a click-through rate prediction of 0.1
over some population of users where its actual click rate is 0.15. In such
cases, the probabilistic predictions have to be fixed before the system can be
deployed.
In this paper, we first introduce a new evaluation metric named field-level
calibration error that measures the bias in predictions over the sensitive
input field that the decision-maker is concerned with. We show that existing
post-hoc calibration methods yield limited improvements in the new field-level
metric and in other non-calibration metrics such as the AUC score. To this end, we propose
Neural Calibration, a simple yet powerful post-hoc calibration method that
learns to calibrate by making full use of the field-aware information over the
validation set. We present extensive experiments on five large-scale datasets.
The results show that Neural Calibration improves significantly over
uncalibrated predictions on common metrics such as the negative log-likelihood,
Brier score and AUC, as well as on the proposed field-level calibration error.
Comment: WWW 202
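As a rough illustration of the field-level calibration error described in the abstract above, the sketch below computes one plausible formalization: the absolute gap between summed predictions and summed outcomes within each value of the chosen field, added up and divided by the total number of samples. The abstract does not give the exact formula, so this definition and the function name `field_level_calibration_error` are assumptions, not the authors' implementation.

```python
from collections import defaultdict


def field_level_calibration_error(fields, preds, labels):
    """Assumed formalization: sum over field values of
    |sum(predictions) - sum(labels)|, divided by the sample count."""
    if not (len(fields) == len(preds) == len(labels)):
        raise ValueError("fields, preds and labels must have equal length")
    if len(preds) == 0:
        raise ValueError("at least one sample is required")

    # Per field value: [sum of predictions, sum of observed labels]
    sums = defaultdict(lambda: [0.0, 0.0])
    for z, p, y in zip(fields, preds, labels):
        sums[z][0] += p
        sums[z][1] += y

    total_gap = sum(abs(pred_sum - label_sum) for pred_sum, label_sum in sums.values())
    return total_gap / len(preds)


# Hypothetical toy data: field "a" is over-predicted, field "b" is calibrated.
toy_fields = ["a", "a", "b", "b"]
toy_preds = [0.1, 0.1, 0.5, 0.5]
toy_labels = [0, 0, 1, 0]
toy_error = field_level_calibration_error(toy_fields, toy_preds, toy_labels)
```

Because the gap is taken per field value before averaging, over-prediction on one subset cannot cancel against under-prediction on another, which is the kind of subset-specific bias the abstract's click-through example describes.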
Vision Aided Environment Semantics Extraction and Its Application in mmWave Beam Selection
In this letter, we propose a novel mmWave beam selection method based on the
environment semantics that are extracted from camera images taken at the user
side. Specifically, we first define the environment semantics as the spatial
distribution of the scatterers that affect the wireless propagation channels
and utilize the keypoint detection technique to extract them from the input
images. Then, we design a deep neural network with environment semantics as the
input that can output the optimal beam pairs at UE and BS. Compared with the
existing beam selection approaches that directly use images as the input, the
proposed semantic-based method can explicitly obtain the environmental features
that account for the propagation of wireless signals, and thus reduce the
burden of storage and computation. Simulation results show that the proposed
method can precisely estimate the location of the scatterers and outperform the
existing image- or LIDAR-based works.
Learn Continuously, Act Discretely: Hybrid Action-Space Reinforcement Learning For Optimal Execution
Optimal execution is a sequential decision-making problem for cost-saving in
algorithmic trading. Studies have found that reinforcement learning (RL) can
help decide the order-splitting sizes. However, a problem remains unsolved: how
to place limit orders at appropriate limit prices? The key challenge lies in
the "continuous-discrete duality" of the action space. On the one hand, the
continuous action space using percentage changes in prices is preferred for
generalization. On the other hand, the trader eventually needs to choose limit
prices discretely due to the existence of the tick size, which requires
specialization for every single stock with different characteristics (e.g., the
liquidity and the price range). So we need continuous control for
generalization and discrete control for specialization. To this end, we propose
a hybrid RL method to combine the advantages of both of them. We first use a
continuous control agent to scope an action subset, then deploy a fine-grained
agent to choose a specific limit price. Extensive experiments show that our
method has higher sample efficiency and better training stability than existing
RL algorithms and significantly outperforms previous learning-based methods for
order execution.
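The two-stage idea in the abstract above (a continuous action scopes a subset of prices, then a discrete action picks one) can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names, the `half_width` parameter, and the rule of centering the subset on the nearest tick are hypothetical and are not taken from the paper, and no learning is shown.

```python
def candidate_limit_prices(ref_price, pct_change, tick_size, half_width=2):
    """Map a continuous action (percentage price change) to a small set of
    tick-aligned candidate limit prices around the implied target price."""
    if tick_size <= 0:
        raise ValueError("tick_size must be positive")
    target = ref_price * (1.0 + pct_change)
    center_tick = round(target / tick_size)  # nearest tick index to the target
    return [
        round((center_tick + k) * tick_size, 10)  # trim floating-point noise
        for k in range(-half_width, half_width + 1)
    ]


def choose_limit_price(candidates, discrete_action):
    """Discrete stage: pick one concrete limit price by index."""
    return candidates[discrete_action]


# Hypothetical usage: reference price 10.00, continuous action +1%, tick 0.05.
candidates = candidate_limit_prices(10.0, 0.01, 0.05)
chosen = choose_limit_price(candidates, 0)
```

Expressing the first stage as a percentage keeps it independent of any one stock's price level, while the second stage works on that stock's actual tick grid, mirroring the generalization versus specialization split the abstract describes.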
Generating Synergistic Formulaic Alpha Collections via Reinforcement Learning
In the field of quantitative trading, it is common practice to transform raw
historical stock data into indicative signals for the market trend. Such
signals are called alpha factors. Alphas in formula forms are more
interpretable and thus favored by practitioners concerned with risk. In
practice, a set of formulaic alphas is often used together for better modeling
precision, so we need to find synergistic formulaic alpha sets that work well
together. However, most traditional alpha generators mine alphas one by one
separately, overlooking the fact that the alphas would be combined later. In
this paper, we propose a new alpha-mining framework that prioritizes mining a
synergistic set of alphas, i.e., it directly uses the performance of the
downstream combination model to optimize the alpha generator. Our framework
also leverages the strong exploratory capabilities of reinforcement
learning (RL) to better explore the vast search space of formulaic alphas. The
contribution to the combination models' performance is assigned to be the
return used in the RL process, driving the alpha generator to find better
alphas that improve upon the current set. Experimental evaluations on
real-world stock market data demonstrate both the effectiveness and the
efficiency of our framework for stock trend forecasting. The investment
simulation results show that our framework is able to achieve higher returns
compared to previous approaches.
Comment: Accepted by KDD '23, ADS trac
Adaptive-Step Graph Meta-Learner for Few-Shot Graph Classification
Graph classification aims to extract accurate information from
graph-structured data for classification and is becoming increasingly
important in the graph learning community. Although Graph Neural Networks (GNNs)
have been successfully applied to graph classification tasks, most of them
overlook the scarcity of labeled graph data in many applications. For example,
in bioinformatics, obtaining protein graph labels usually needs laborious
experiments. Recently, few-shot learning has been explored to alleviate this
problem given only a few labeled graph samples of test classes. The shared
sub-structures between training classes and test classes are essential in
few-shot graph classification. Existing methods assume that the test classes
belong to the same set of super-classes clustered from training classes.
However, according to our observations, the label spaces of training classes
and test classes usually do not overlap in real-world scenarios. As a result,
the existing methods do not capture the local structures of unseen test
classes well. To overcome this limitation, in this paper, we propose a direct method
to capture the sub-structures with a well-initialized meta-learner within a few
adaptation steps. More specifically, (1) we propose a novel framework
consisting of a graph meta-learner, which uses GNN-based modules for fast
adaptation on graph data, and a step controller for the robustness and
generalization of the meta-learner; (2) we provide quantitative analysis for the
framework and give a graph-dependent upper bound of the generalization error
based on our framework; (3) extensive experiments on real-world datasets
demonstrate that our framework achieves state-of-the-art results on several
few-shot graph classification tasks compared to baselines.
Potential of Core-Collapse Supernova Neutrino Detection at JUNO
JUNO is an underground neutrino observatory under construction in Jiangmen, China. It uses 20 kton of liquid scintillator as its target, which enables it to detect supernova burst neutrinos with large statistics from the next galactic core-collapse supernova (CCSN), as well as pre-supernova neutrinos from nearby CCSN progenitors. All flavors of supernova burst neutrinos can be detected by JUNO via several interaction channels, including inverse beta decay, elastic scattering on electrons and protons, interactions on 12C nuclei, etc. This retains the possibility for JUNO to reconstruct the energy spectra of supernova burst neutrinos of all flavors. Real-time monitoring systems based on FPGA and DAQ are under development in JUNO, which allow prompt alerts and trigger-less data acquisition of CCSN events. The alert performance of both monitoring systems has been thoroughly studied using simulations. Moreover, once a CCSN is tagged, the system can give fast characterizations, such as directionality and light curve.